AITopics | least-square problem

We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching. We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT). While current randomized solvers for least-squares optimization prescribe an embedding dimension at least greater than the data dimension, we show that the embedding dimension can be reduced to the effective dimension of the optimization problem while still preserving high-probability convergence guarantees. In this regard, we derive sharp matrix deviation inequalities over ellipsoids for both Gaussian and SRHT embeddings. Specifically, we improve on the constant of a classical Gaussian concentration bound whereas, for SRHT embeddings, our deviation inequality involves a novel technical approach. Leveraging these bounds, we are able to design a practical and adaptive algorithm that does not require knowing the effective dimension beforehand. Our method starts with an initial embedding dimension equal to 1 and, over iterations, increases the embedding dimension up to the effective one at most. Hence, our algorithm improves the state-of-the-art computational complexity for solving regularized least-squares problems. Further, we show numerically that it outperforms standard iterative solvers such as the conjugate gradient method and its preconditioned version on several standard machine learning datasets.
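To make the terms in this abstract concrete, here is a minimal NumPy sketch. It is not the paper's adaptive algorithm. It assumes the common definition of effective dimension, d_e = sum_i s_i^2 / (s_i^2 + nu^2), which may differ from the paper's exact normalization. It also uses a fixed, deliberately generous Gaussian embedding dimension `m` instead of growing it from 1. The function names and test data are hypothetical.

```python
import numpy as np


def effective_dimension(A, nu):
    """d_e = sum_i s_i^2 / (s_i^2 + nu^2), with s_i the singular values of A."""
    s = np.linalg.svd(A, compute_uv=False)
    return float(np.sum(s**2 / (s**2 + nu**2)))


def ridge_direct(A, b, nu):
    """Reference solution of min_x 0.5*||Ax - b||^2 + 0.5*nu^2*||x||^2."""
    d = A.shape[1]
    return np.linalg.solve(A.T @ A + nu**2 * np.eye(d), A.T @ b)


def ridge_sketched(A, b, nu, m, iters=30, seed=0):
    """Newton-type iterations with a Gaussian-sketched Hessian (fixed m)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    S = rng.standard_normal((m, n)) / np.sqrt(m)  # Gaussian embedding, E[S^T S] = I
    SA = S @ A                                    # m x d sketch of the data
    H_s = SA.T @ SA + nu**2 * np.eye(d)           # sketched regularized Hessian
    x = np.zeros(d)
    for _ in range(iters):
        grad = A.T @ (A @ x - b) + nu**2 * x      # exact gradient of the objective
        x = x - np.linalg.solve(H_s, grad)        # approximate Newton step
    return x


# Synthetic data (assumption): columns with decaying scales, so d_e < d.
rng = np.random.default_rng(1)
n, d, nu = 2000, 20, 5.0
A = rng.standard_normal((n, d)) / np.arange(1, d + 1)
b = rng.standard_normal(n)

d_e = effective_dimension(A, nu)
x_ref = ridge_direct(A, b, nu)
x_skt = ridge_sketched(A, b, nu, m=800)
rel_err = np.linalg.norm(x_skt - x_ref) / np.linalg.norm(x_ref)
print(f"effective dimension: {d_e:.2f} (data dimension {d})")
print(f"relative error of sketched solver: {rel_err:.2e}")
```

The sketched Hessian only needs to approximate the true Hessian well enough for the iteration to contract. The abstract's claim is that an embedding dimension on the order of the effective dimension suffices for this.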

effective dimension adaptive sketching method, faster regularized least-square optimization, name change, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)

Low-Rank Tucker Decomposition of Large Tensors Using TensorSketch

Osman Asif Malik, Stephen Becker

Neural Information Processing Systems, Nov-20-2025, 16:03:49 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, data mining, machine learning, (20 more...)

Country:

Africa > Senegal > Kolda Region > Kolda (0.05)
North America > United States > Colorado > Boulder County > Boulder (0.05)
North America > United States > Ohio > Franklin County > Columbus (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.94)

Matrix-Free Least Squares Solvers: Values, Gradients, and What to Do With Them

Hrittik Roy, Søren Hauberg, Nicholas Krämer

arXiv.org Artificial Intelligence, Oct-23-2025

This paper argues that the method of least squares has significant unfulfilled potential in modern machine learning, far beyond merely being a tool for fitting linear models. To release its potential, we derive custom gradients that transform the solver into a differentiable operator, like a neural network layer, enabling many diverse applications. Empirically, we demonstrate: (i) scalability, by enforcing weight sparsity on a 50-million-parameter model; (ii) conservativeness constraints imposed on score-based generative models; and (iii) hyperparameter tuning of Gaussian processes based on predictive performance. By doing this, our work represents the next iteration in developing differentiable linear-algebra tools and making them widely accessible to machine learning practitioners.
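The idea of treating a least-squares solver as a differentiable operator can be sketched as below. This is not the paper's matrix-free implementation. It assumes a small dense matrix `A` with full column rank, and uses the standard gradients obtained by differentiating the normal equations A^T A x = A^T b. The function names are hypothetical.

```python
import numpy as np


def lstsq_forward(A, b):
    """Forward pass: x = argmin_x ||A x - b||^2 (dense solver for illustration)."""
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x


def lstsq_backward(A, b, x, g):
    """Backward pass: given g = dL/dx, return (dL/dA, dL/db).

    Assumes A has full column rank, so A^T A is invertible.
    """
    lam = np.linalg.solve(A.T @ A, g)                  # adjoint solve
    r = b - A @ x                                      # residual at the solution
    grad_b = A @ lam
    grad_A = np.outer(r, lam) - np.outer(A @ lam, x)
    return grad_A, grad_b


# Usage with a linear loss L(x) = c^T x, so that dL/dx = c.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3))
b = rng.standard_normal(8)
c = rng.standard_normal(3)

x = lstsq_forward(A, b)
grad_A, grad_b = lstsq_backward(A, b, x, c)
print(grad_A.shape, grad_b.shape)
```

A matrix-free variant would replace both dense solves with an iterative method that only needs products with `A` and its transpose. That substitution is what lets the approach scale to large models.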

artificial intelligence, constraint, machine learning, (20 more...)

arXiv:2510.19634

Country:

North America (0.28)
Europe (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

0cb310ed8121549488fea8e8c2056096-Paper-Conference.pdf

Neural Information Processing Systems, Oct-8-2025, 02:49:58 GMT

dense update, equation, neural network, (15 more...)

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)
Europe > Austria (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.68)

Function Spaces Without Kernels: Learning Compact Hilbert Space Representations

Su Ann Low, Quentin Rommel, Kevin S. Miller, Adam J. Thorpe, Ufuk Topcu

arXiv.org Artificial Intelligence, Sep-26-2025

Function encoders are a recent technique that learns neural network basis functions to form compact, adaptive representations of Hilbert spaces of functions. We show that function encoders provide a principled connection to feature learning and kernel methods by defining a kernel through an inner product of the learned feature map. This kernel-theoretic perspective explains their ability to scale independently of dataset size while adapting to the intrinsic structure of data, and it enables kernel-style analysis of neural models. Building on this foundation, we develop two training algorithms that learn compact bases: a progressive training approach that constructively grows bases, and a train-then-prune approach that offers a computationally efficient alternative after training. Both approaches use principles from PCA to reveal the intrinsic dimension of the learned space. In parallel, we derive finite-sample generalization bounds using Rademacher complexity and PAC-Bayes techniques, providing inference-time guarantees. We validate our approach on a polynomial benchmark with a known intrinsic dimension, and on nonlinear dynamical systems including a Van der Pol oscillator and a two-body orbital model, demonstrating that the same accuracy can be achieved with substantially fewer basis functions. This work suggests a path toward neural predictors with kernel-level guarantees, enabling adaptable models that are both efficient and principled at scale.
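The two ideas in this abstract, a kernel defined by an inner product of a feature map and a PCA-style estimate of intrinsic dimension, can be illustrated with a toy example. The sketch below makes strong assumptions. A fixed, hand-written polynomial feature map with redundant entries stands in for learned neural basis functions, and an uncentered SVD of the feature matrix stands in for the paper's PCA procedure. All names are hypothetical.

```python
import numpy as np


def feature_map(x):
    """Toy stand-in for learned basis functions: five features, two redundant."""
    x = np.asarray(x, dtype=float)
    return np.stack([np.ones_like(x), x, x**2, 2.0 * x, x**2 + 1.0], axis=-1)


def induced_kernel(x1, x2):
    """Kernel k(x, x') = <phi(x), phi(x')> defined through the feature map."""
    return feature_map(x1) @ feature_map(x2).T


def intrinsic_dimension(x, rel_tol=1e-8):
    """Count singular values of the feature matrix above a relative threshold."""
    s = np.linalg.svd(feature_map(x), compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))


xs = np.linspace(-1.0, 1.0, 50)
K = induced_kernel(xs, xs)       # 50 x 50 Gram matrix, symmetric PSD
dim = intrinsic_dimension(xs)    # 5 basis functions, only 3 linearly independent
print(K.shape, dim)
```

Here the five features span only the quadratic polynomials, so a three-function basis would represent the same space. This is the kind of redundancy a train-then-prune step aims to remove.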

artificial intelligence, basis function, machine learning, (15 more...)

arXiv:2509.20605

Country: North America > United States > Texas > Travis County > Austin (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)

f3f1fa1e4348bfbebdeee8c80a04c3b9-Supplemental.pdf

ea6979872125d5acbac6068f186a0359-Paper.pdf

e105b88b3e1ac23ec811a708cd7edebf-Paper.pdf

Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization